fix: handle changed urls #3

Merged
merged 1 commit into from
Dec 11, 2024

Conversation

JarbasAl
Member

@JarbasAl JarbasAl commented Dec 1, 2024

files renamed to just config.json

Summary by CodeRabbit

  • New Features

    • Improved logic for determining the configuration path in the text-to-speech plugin, ensuring specific models point to the correct configuration file.
  • Bug Fixes

    • Enhanced clarity in configuration handling without impacting existing functionalities.

files renamed to just config.json
@JarbasAl JarbasAl requested a review from goldyfruit December 1, 2024 23:37

coderabbitai bot commented Dec 1, 2024

Walkthrough

The changes in this pull request focus on the NosTTSPlugin class within the ovos_tts_plugin_nos/__init__.py file. The get_engine method has been updated to refine the logic for determining the config_path, specifically targeting certain model files to point to a unified configuration file. The download method remains unchanged, and other methods like phonemize and get_tts continue to operate without modifications. Overall, the changes improve the clarity of configuration handling while maintaining existing functionality.
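As a rough illustration of the behaviour the walkthrough describes, the sketch below isolates the config-path fallback. The helper name `resolve_config_path` and the example URL are hypothetical; only the `replace` chain is taken from the line quoted in the review comments further down this page.

```python
from typing import Optional


def resolve_config_path(model_path: str, config_path: Optional[str] = None) -> str:
    """Return the explicit config path, or derive one from the model path."""
    # Same fallback expression as the line quoted in the review:
    # the known model file names are swapped for the shared "config.json".
    return config_path or model_path.replace("celtia.pth", "config.json").replace(
        "sabela.pth", "config.json"
    )


# Hypothetical URL, used only for illustration.
example = resolve_config_path("https://example.org/models/celtia.pth")
```

Note that with this expression a model path containing neither file name comes back unchanged, so the fallback only helps for the two named models.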

Changes

File | Change Summary
ovos_tts_plugin_nos/__init__.py | Updated get_engine method to refine config_path logic for specific models; method signature modified. download, phonemize, and get_tts methods unchanged.

Poem

In the land of code where rabbits play,
A tweak was made to brighten the day.
With paths aligned, the models now sing,
A harmony found in every spring.
Hooray for clarity, let the plugins cheer,
For every change brings us closer, my dear! 🐰✨



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
ovos_tts_plugin_nos/__init__.py (3)

Line range hint 73-79: Add security measures for file downloads

The current implementation downloads files without proper security measures. Consider these improvements:

  1. Validate URLs against an allowlist
  2. Implement checksum verification for downloaded files
  3. Add proper error handling for file operations

Here's a suggested implementation:

@classmethod
def get_engine(cls, model_path: str, config_path: Optional[str] = None) -> Synthesizer:
-    config_path = config_path or model_path.replace("celtia.pth", "config.json").replace("sabela.pth", "config.json")
+    ALLOWED_DOMAINS = ["huggingface.co"]
+    MODEL_CHECKSUMS = {
+        "celtia.pth": "<expected_checksum>",
+        "sabela.pth": "<expected_checksum>"
+    }
+    
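+    def validate_url(url: str) -> bool:
+        from urllib.parse import urlparse
+        # Match the host exactly or as a subdomain; a substring test would also accept e.g. "huggingface.co.evil.com"
+        host = urlparse(url).hostname or ""
+        return any(host == domain or host.endswith("." + domain) for domain in ALLOWED_DOMAINS)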
+    
+    def get_config_path(model_path: str) -> str:
+        base_path = os.path.dirname(model_path)
+        return os.path.join(base_path, "config.json")
+
+    config_path = config_path or get_config_path(model_path)
+
     if model_path.startswith("http"):
+        if not validate_url(model_path):
+            raise ValueError(f"Invalid URL domain: {model_path}")
         model_path = NosTTSPlugin.download(model_path)
+        
     if config_path.startswith("http"):
+        if not validate_url(config_path):
+            raise ValueError(f"Invalid URL domain: {config_path}")
         config_path = NosTTSPlugin.download(config_path)
+
+    if not os.path.exists(config_path):
+        raise FileNotFoundError(f"Config file not found: {config_path}")
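The suggestion above lists checksum verification as a goal but the snippet never reads `MODEL_CHECKSUMS`. A minimal sketch of what that step could look like follows, assuming SHA-256 digests are published for the model files. The helper names `sha256_of_file` and `verify_checksum` are hypothetical and not part of the plugin.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large model files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: str, expected_hex: str) -> None:
    """Raise ValueError if the file's SHA-256 digest differs from the expected one."""
    actual = sha256_of_file(path)
    if actual != expected_hex.lower():
        raise ValueError(f"Checksum mismatch for {path}: got {actual}")
```

Such a check would presumably run right after the download step and before the model is loaded, so a corrupted or tampered file is rejected early.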

Line range hint 73-79: Improve path handling robustness

The current string replacement approach for determining config path is brittle. Consider using proper path manipulation functions.

Here's a more robust implementation:

-    config_path = config_path or model_path.replace("celtia.pth", "config.json").replace("sabela.pth", "config.json")
+    def get_config_path(model_path: str) -> str:
+        """Get config path based on model path using proper path manipulation"""
+        if any(model in model_path for model in ["celtia.pth", "sabela.pth"]):
+            return os.path.join(os.path.dirname(model_path), "config.json")
+        return model_path.rsplit(".", 1)[0] + "_config.json"
+
+    config_path = config_path or get_config_path(model_path)
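Because `model_path` may be either a URL or a local path, one possible refinement is to treat the two cases separately, so URLs always keep forward slashes regardless of platform. This is a sketch only: `sibling_config_path` is a hypothetical helper, and it assumes the config file sits next to the model file.

```python
import os
import posixpath
from urllib.parse import urlsplit, urlunsplit


def sibling_config_path(model_path: str, config_name: str = "config.json") -> str:
    """Return the path or URL of `config_name` located next to the model file."""
    parts = urlsplit(model_path)
    if parts.scheme in ("http", "https"):
        # URL paths always use "/", so use posixpath rather than os.path here.
        new_path = posixpath.join(posixpath.dirname(parts.path), config_name)
        return urlunsplit(parts._replace(path=new_path))
    # Local file: let os.path apply the platform's separator rules.
    return os.path.join(os.path.dirname(model_path), config_name)
```

Unlike the string replacement, this does not depend on the model's file name, which may or may not be desirable for models that ship with a differently named config.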

Line range hint 73-83: Add logging for configuration path resolution

To aid in debugging and monitoring, consider adding logging statements to track configuration path resolution.

+    LOG.debug(f"Resolving config path for model: {model_path}")
     config_path = config_path or model_path.replace("celtia.pth", "config.json").replace("sabela.pth", "config.json")
+    LOG.info(f"Using config path: {config_path}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 20926da and 0dc881a.

📒 Files selected for processing (1)
  • ovos_tts_plugin_nos/__init__.py (1 hunks)
🔇 Additional comments (1)
ovos_tts_plugin_nos/__init__.py (1)

73-73: Verify that both models can safely share the same config file

The change consolidates config files for both models (celtia.pth and sabela.pth) into a single config.json. Please ensure that both models are compatible with the same configuration settings.
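One way to check this by hand, assuming the two original per-model config files are still available locally, is to load both as JSON and list the keys whose values differ. The function `diff_configs` below is a hypothetical helper, not something provided by the plugin or by the TTS library.

```python
from typing import Any, Dict, Tuple

_MISSING = "<missing>"  # marker for a key present in only one config


def diff_configs(
    a: Dict[str, Any], b: Dict[str, Any], prefix: str = ""
) -> Dict[str, Tuple[Any, Any]]:
    """Return {dotted.key: (value_in_a, value_in_b)} for every differing leaf."""
    differences: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(a) | set(b)):
        left = a.get(key, _MISSING)
        right = b.get(key, _MISSING)
        path = f"{prefix}{key}"
        if isinstance(left, dict) and isinstance(right, dict):
            # Recurse into nested sections and keep the dotted path.
            differences.update(diff_configs(left, right, prefix=path + "."))
        elif left != right:
            differences[path] = (left, right)
    return differences
```

An empty result would suggest the two files are interchangeable; any reported keys would need to be reviewed before relying on a single config.json.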

✅ Verification successful

Let me gather more information about the config file usage and model configurations.


Let me check for any model-specific imports or dependencies that might indicate different configuration requirements.


Based on the gathered information, I can now provide the final response:

Sharing config.json between models is safe and intentional

The models can safely share the same config.json because:

  • Both models are VITS-based but handle text differently (celtia uses graphemes, sabela uses phonemes)
  • The model-specific processing is handled in the code:
    • Sabela voice uses phonemization through Cotovia
    • Celtia voice uses direct grapheme input
  • The config.json is used only for basic model initialization in the Synthesizer class
  • The README confirms both voices are supported through the same plugin configuration structure
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if models have different config requirements
for model in "celtia.pth" "sabela.pth"; do
  echo "Checking config for $model:"
  rg -A 10 -B 10 "$model.*config" 
done

Length of output: 3367


Script:

#!/bin/bash
# Check for model-specific configurations or settings
echo "Checking model configurations:"
rg -l "config" --type py

echo -e "\nChecking Synthesizer usage and configuration:"
ast-grep --pattern 'class Synthesizer {
  $$$
}'

echo -e "\nChecking config file content or structure references:"
rg -A 5 "config.*json" 

Length of output: 953


Script:

#!/bin/bash
# Check for model imports and dependencies
echo "Checking imports and model-related code:"
rg -A 5 "^from|^import" ovos_tts_plugin_nos/__init__.py

echo -e "\nChecking model initialization or setup:"
rg -A 10 "def __init__" ovos_tts_plugin_nos/__init__.py

echo -e "\nChecking for any model-specific constants or settings:"
rg -B 5 -A 5 "CELTIA|SABELA" --ignore-case

Length of output: 5853

@JarbasAl JarbasAl merged commit 9a55d3c into dev Dec 11, 2024
2 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Dec 11, 2024
@coderabbitai coderabbitai bot mentioned this pull request Jan 19, 2025